THE ROLE OF GRAPH-THEORETICAL INVARIANTS IN CHEMISTRY


D.H.  Rouvray 

Department  of  Chemistry,  University  of  Georgia,  Athens,  Georgia  30602 


Abstract 


Graph-theoretical invariants have come to play an increasingly important
role  in  chemistry  over  the  past  two  decades.  Starting  from  the  chemical 
graph  representing  some  molecular  species,  the  most  frequently  derived 
invariants  are  simple  numerical  descriptors  and  polynomials.  Whereas  the 
polynomials have been used widely in the study of problems relating to
chemical  bonding  theory,  the  numerical  invariants  have  found  major 
application  in  the  prediction  of  the  behavior  of  chemical  species.  The 
numerical  descriptors,  known  to  chemists  as  topological  indices,  are  treated 
as  inherent  properties  of  the  molecules  they  are  employed  to  characterize. 
As  such,  they  can  be  correlated  against  many  other,  experimentally  measured 
properties of the molecules. It is from correlations of this type that
predictions of the properties of unmeasured species can be made. Topological
indices  enjoy  the  twin  advantages  of  being  comparatively  easy  to  compute 
and  of  yielding  a  result  which  is  free  from  (experimental)  error.  To  date, 
the  molecular  properties  which  topological  indices  have  been  correlated 
with  include  physical,  chemical,  thermodynamic,  biochemical,  pharmacological 
and  toxicological  properties.  Over  100  different  topological  indices  have 
been  advanced  in  the  chemical  literature,  though  only  a  handful  have  so 
far  found  significant  application.  The  results  of  correlative  studies 
have  in  general  proved  to  be  highly  encouraging,  and  lead  us  to  the 
conclusion  that  the  use  of  topological  indices  represents  a  significant 
advance  in  the  prediction  of  the  behavior  of  chemical  substances. 


Chemical  Graph  Invariants 


Graph  invariants  have  been  the  focus  of  much  interest  on  the  part 
of  mathematicians  for  well  over  a  century  [1],  and  many  different  invariants 
have been intensively studied [2]. In the chemical domain, graph invariants
have  also  been  used  for  over  a  century,  unwittingly  at  first  and  then  with 
increasing  awareness.  In  fact,  over  the  past  two  decades,  graph  invariants 
may  be  said  to  have  attained  a  position  of  some  prominence  in  chemistry. 
This  accomplishment  is  due  in  large  measure  to  the  now  prevailing  view 
that  graph  invariants  represent  in  the  hands  of  the  chemist  a  valuable 
and  useful  new  tool.  Graph  invariants  are  being  used  principally  for  the 
characterization  of  chemical  graphs.  In  this  context,  the  invariants  are 
regarded  as  properties  of  the  chemical  graphs,  and,  by  a  fairly  natural 
extension,  also  as  properties  of  the  molecules  the  graphs  represent. 
Properties  obtained  by  purely  mathematical  means,  namely  by  applying  an 
appropriate  algorithm  to  the  chemical  graph,  are  usually  treated  like  any 
other  molecular  property.  They  can,  for  instance,  be  correlated  against 
other,  experimentally  determined  properties,  such  as  the  boiling  point, 
melting  point,  or  refractive  index.  Graph  invariants  thus  provide  a 
convenient  means  of  converting  the  structure  of  a  molecule  into  a  parameter 
that  may  be  regarded  as  a  property  of  that  molecule  [3]. 


In  chemical  terms,  the  invariants  are  said  to  characterize  the  molecular 
topology  of  the  molecule  under  consideration,  i.e.  they  yield  a  measure 
of  the  connectedness  of  the  chemical  graph.  The  invariants  themselves 
are usually referred to as topological indices. Currently, over 100
different  topological  indices  have  been  put  forward  in  the  chemical 
literature  for  the  purpose  of  characterizing  chemical  graphs,  though  to 
date  no  more  than  a  handful  of  these  have  found  major  applications  [4]. 
Some  of  the  indices  are  suitable  for  the  characterization  of  specific  parts 
of  chemical  graphs.  The  ability  to  do  this  is  especially  useful  when  dealing 
with  molecules  which  have  active  sites  or  centers.  Using  regression 
analyses,  indices  have  been  correlated  with  a  wide  range  of  measured 
molecular  properties  [5,10],  including  physical  (e.g.  boiling  point), 
chemical  (e.g.  reactivity),  thermodynamic  (e.g.  heat  of  combustion), 
biochemical  (e.g.  biological  degradability),  pharmacological  (e.g.  anesthetic 
behavior), physiological (e.g. mutagenicity), and toxicological (e.g.
toxicity) properties of species. Correlation coefficients in the region
of 0.99 are fairly commonplace for physicochemical properties, whereas
for  the  more  biologically  oriented  properties  a  coefficient  of  around  0.95 
is  usually  regarded  as  excellent. 

The  reason  for  the  extraordinary  range  of  properties  correlated  and 
predicted  using  topological  indices  is  that  the  indices  are  reflecting 
a  very  fundamental  molecular  parameter.  This  parameter  is  the  connectedness 
of  the  graph  or,  in  chemical  terminology,  the  topology  of  the  molecule. 
In  a  large  number  of  cases,  it  is  this  parameter  which  appears  to  govern 
molecular  behavior.  Moreover,  it  is  now  well-known  [11]  that  topological 
indices  provide  a  measure  of  both  the  size  and  shape  of  the  molecules  they 
represent.  By  size  in  the  present  context  is  meant  the  volume  of  the 
molecule,  and  by  shape  we  understand  the  distribution  of  that  volume  in 
3-space.  Although  the  physicochemical  properties  of  molecular  species 
are  frequently  largely  determined  by  these  factors,  biological  properties 
are  in  general  dependent  on  several  different  factors.  This  makes  biological 
properties  more  difficult  to  correlate  using  only  topological  indices; 
the  generally  lower  correlation  coefficients  for  a  number  of  biological 
properties  bear  witness  to  this  fact.  Many  biological  responses  are 
triggered  by  the  interaction  of  specific  parts  of  a  molecule  with  some 
biological  receptor.  In  such  cases,  only  the  part  of  the  molecule  actually 
interacting  —  rather  than  the  molecule  as  a  whole  —  is  of  prime  importance. 
Thus,  indices  which  characterize  specific  molecular  sites  are  of  especial 
value  for  a  number  of  biological  correlations. 

In  general,  topological  indices  work  best  when  they  are  employed  for 
correlations  of  related  series  of  molecules,  e.g.  for  the  members  of 
homologous  series.  Within  such  series,  equivalent  chemical  bonds  in  the 
various  members  are  more  or  less  identical  in  the  sense  that  they  possess 
closely similar force constants. Whenever this condition holds, the members
of  the  series  are  said  to  exhibit  bond  transferability.  Much  of  the  early 
correlative  work  was  carried  out  on  homologous  series,  such  as  the  alkane 
(CnH2n+2) series [12-14]. More recently, correlations have been obtained
for  a  variety  of  other  series,  such  as  those  based  on  nitrogen-containing 
heterocyclic  compounds  [15].  For  the  indices  to  function  as  molecular 
descriptors,  they  must  be  able  to  provide  characterizations  of  many  different 
classes  of  graph  of  chemical  interest.  Thus,  they  must  be  able  to  describe 
trees and especially the extent of branching in them, as well as reflect
the presence of multiple edges (multiple bonds), cycles (rings), and weighted
edges (heteroatoms). Ideally, topological indices should yield unique
characterizations of these graphs. Although there is no unique index
currently available (and the discovery of such an index seems highly
unlikely), a number of them are highly discriminating for classes of
graphs of chemical interest [16]. The nonuniqueness is not as troublesome
to  chemists  as  it  might  seem,  since  nonisomorphic  graphs  possessed  of 
identical  indices  are  frequently  associated  with  molecules  displaying  roughly 
comparable  properties. 

The  Wiener  Index 

The  first  of  the  topological  indices,  put  forward  by  Wiener  [12]  in 
1947, is known to reflect principally the size of a molecule, though some
allowance  is  also  made  for  its  shape  [17].  The  Wiener  index,  W(G),  is 
defined  as  one  half  the  sum  of  the  elements  in  the  distance  matrix  of  the 
chemical  graph  in  question,  i.e. 

$$W(G) = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} d_{ij} , \qquad (1)$$

where  n  is  the  number  of  non-hydrogen  atoms  in  the  chemical  graph.  Hydrogen 
atoms  are  usually  excluded  from  chemical  graphs  as  they  are  not 
structure-determining;  they  can  readily  be  added  in  if  and  when  required. 
Several  formulas  for  calculating  W(G)  values  have  been  cited  by  Rouvray 
[18]; a closed, analytical expression for the general tree was recently
published  by  Canfield  et  al.  [19].  The  index  has  been  used  primarily 
in  correlating  a  wide  range  of  the  physicochemical  properties  of  hydrocarbon 
molecules [14]. To a very limited extent, the index has also found
application  in  the  more  biologically  oriented  sciences  [20].  As  typical 
examples  of  the  type  of  correlations  obtained,  we  show  in  Figure  1 
correlations  on  linear  and  logarithmic  scales  for  the  boiling  points  of 
unbranched  alkane  (CnH2n+2)  molecules.  Note  that,  even  on  the  logarithmic 
scale,  the  plot  is  not  a  straight  line  (vide  infra). 
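
As a minimal sketch of equation (1), the Wiener index can be obtained by running a breadth-first search from every vertex of the hydrogen-suppressed graph and halving the total of all distances. The adjacency-list representation, the function names, and the vertex labels below are illustrative assumptions, not part of the original treatment.

```python
from collections import deque


def distances_from(adj, source):
    """Breadth-first search: edge-count distance from source to every vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def wiener_index(adj):
    """One half of the sum of all entries of the distance matrix (equation 1)."""
    total = sum(sum(distances_from(adj, v).values()) for v in adj)
    return total // 2


# Hydrogen-suppressed graphs with hypothetical vertex labels.
n_butane = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
isobutane = {1: [2], 2: [1, 3, 4], 3: [2], 4: [2]}

print(wiener_index(n_butane))   # 10
print(wiener_index(isobutane))  # 9
```

The branched isomer gives the smaller value, which illustrates how W(G) responds to shape as well as size.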

Figure 2. Closed expressions for the values assumed by the Wiener index,
W(G), in various monomeric units.

Some of the greatest successes of the Wiener index have been achieved
with very large systems, such as polymeric or solid state systems. One
approach, due to Mekenyan et al. [21], made use of a normalized Wiener
index which took finite values for infinite chains of monomeric units.
Closed expressions for the Wiener index of a number of such chains are
shown in Figure 2. The indices were normalized by dividing by a polynomial
of the same degree. Substitution of the normalized W(G) values into
appropriate regression equations afforded good estimates of the properties
of the infinite chains [21]. Moreover, Rouvray and Pandey [22] have
demonstrated  that  W(G)  can  provide  useful  information  on  the  mean 
configuration  adopted  by  long  alkane  molecules  at  their  boiling  point. 
Using  the  concept  of  fractal  dimensionality,  they  were  able  to  show  that 
the  ratios  of  slopes  in  the  logarithmic  plot  in  Figure  1  should  tend  to 
the  limit  value  of  0.6  for  unbranched  chains  of  infinite  length.  As  this 
limit  is  indeed  approached  in  practice,  it  becomes  possible  to  estimate 
the  mean  configuration  of  the  molecules  [15].  In  solid  state  lattices, 
favored vacancy positions can be determined by calculating differences
in W(G) for the structure under study and some ideal reference structure
[23].  Minimization  of  these  differences  leads  to  the  structure  actually 
adopted.  This  work  could  have  important  applications  in  the  modeling  of 
crystal  growth  processes,  gas  absorption  in  solids,  catalytic  reactions, 
and  the  optimal  positioning  of  vacancies  in  crystal  lattices. 


After  Wiener's  early  work,  the  next  major  advance  came  in  1971  when 
Hosoya  [24]  put  forward  his  index.  The  index  is  defined  by  the  equation: 

$$Z(G) = \sum_{k=0}^{[n/2]} p(G,k) , \qquad (2)$$

where p(G,k) is the number of ways in which k disconnected K2 graphs
can be imbedded in the chemical graph G as subgraphs, and [n/2] represents
the maximal value assumed by k. From the definition it follows that p(G,0)
will be unity, p(G,1) the number of edges in G, and p(G,n/2) the number
of 1-factors (Kekulé structures) in G. The index has been used primarily
to  model  the  properties  of  hydrocarbon  species.  A  number  of  empirical 
rules  have  been  advanced  [25]  which  prescribe  the  extent  of  the  lowering 
of  the  boiling  point  with  increasing  branching  in  a  variety  of  alkane 
species.  Gutman  [26]  has  also  shown  that  Z(G)  is  suitable  for  modeling 
the  well-known  alternations  in  the  boiling  points  of  substituted  alkane 
species.  Finally,  it  has  been  demonstrated  [27]  that  log  Z(G)  values  can 
model  both  the  diminution  of  the  rotational  degree  of  freedom  of  molecules 
associated  with  increasing  branching,  and  the  decrease  in  the  partition 
function  arising  from  overcrowded  molecular  conformations. 
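
The counting behind equation (2) can be sketched by brute force: p(G,k) is the number of ways of choosing k edges of G that share no vertex, and Z(G) is the sum of these counts. The edge-list representation and the function names below are assumptions made for illustration; the enumeration is exponential and is only meant for small graphs.

```python
def matching_counts(edges):
    """Return [p(G,0), p(G,1), ...], where p(G,k) counts sets of k pairwise disjoint edges."""
    counts = {}

    def extend(start, used, k):
        # Every call corresponds to exactly one set of k disjoint edges.
        counts[k] = counts.get(k, 0) + 1
        for idx in range(start, len(edges)):
            u, v = edges[idx]
            if u not in used and v not in used:
                extend(idx + 1, used | {u, v}, k + 1)

    extend(0, frozenset(), 0)
    return [counts[k] for k in range(max(counts) + 1)]


def hosoya_index(edges):
    """Z(G): the sum of p(G,k) over all k (equation 2)."""
    return sum(matching_counts(edges))


# Hydrogen-suppressed graphs with hypothetical vertex labels.
n_butane = [(1, 2), (2, 3), (3, 4)]
isobutane = [(1, 2), (2, 3), (2, 4)]
benzene_ring = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)]

print(matching_counts(n_butane), hosoya_index(n_butane))    # [1, 3, 1] 5
print(matching_counts(isobutane), hosoya_index(isobutane))  # [1, 3] 4
print(matching_counts(benzene_ring))                        # [1, 6, 9, 2]
```

For the six-membered ring the last entry, p(G,3) = 2, is the number of 1-factors, i.e. the two Kekulé structures.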


Hosoya  [28]  also  introduced  the  Z(G)  polynomial  defined  as  follows: 

$$P(G;x) = \sum_{k=0}^{[n/2]} p(G,k)\, x^{k} . \qquad (3)$$

Both  the  Hosoya  index  and  polynomial  are  closely  related  to  several  other 
graph  invariants  in  chemistry.  For  instance,  Z(G)  is  linked  with  the 
characteristic  polynomial,  C(x),  of  a  given  graph.  In  the  case  of  trees, 
the  relationship  assumes  the  form  [24,29]: 


$$C_T(x) = \sum_{k=0}^{[n/2]} (-1)^{k}\, p(T,k)\, x^{n-2k} , \qquad (4)$$

whereas  for  cyclic  graphs  some  additional  terms  appear  in  equation  (4). 
The  characteristic  polynomial  yields  via  its  eigenvalues  the  energy  levels 
of  the  molecule  under  study  [30].  Moreover,  the  values  assumed  by  Z(G) 
for  linear  molecules  (path  graphs)  form  members  of  the  Fibonacci  series 
while  corresponding  values  for  monocycles  form  members  of  the  Lucas  series. 
Broadly  speaking,  polynomial  invariants  have  been  employed  in  chemistry 
for  three  major  purposes,  viz.  (i)  study  of  the  bonding  in  aromatic  and 
other organic molecules based on simple one-electron models; (ii) correlation
of  a  variety  of  physicochemical  properties  of  molecules;  and  (iii) 
investigation  of  the  extent  of  branching  in  molecular  species  and  the 
chemical  consequences  that  may  be  drawn  therefrom.  For  a  full  account 
of  the  manifold  uses  of  polynomials  in  chemistry,  the  interested  reader 
is  referred  to  the  review  by  Gutman  [31]. 
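
The Fibonacci and Lucas pattern noted above for path graphs and monocycles can be checked numerically on small cases. The following is a brute-force sketch; the helper names and the vertex numbering are assumptions made for illustration only.

```python
def hosoya_index(edges):
    """Count all sets of pairwise disjoint edges, the empty set included."""

    def count(start, used):
        total = 1  # the edge set chosen so far
        for idx in range(start, len(edges)):
            u, v = edges[idx]
            if u not in used and v not in used:
                total += count(idx + 1, used | {u, v})
        return total

    return count(0, frozenset())


def path_edges(n):
    """Edges of the path graph on vertices 1..n (an unbranched chain)."""
    return [(i, i + 1) for i in range(1, n)]


def cycle_edges(n):
    """Edges of the monocycle on vertices 1..n."""
    return path_edges(n) + [(n, 1)]


paths = [hosoya_index(path_edges(n)) for n in range(1, 9)]
cycles = [hosoya_index(cycle_edges(n)) for n in range(3, 9)]

print(paths)   # [1, 2, 3, 5, 8, 13, 21, 34]
print(cycles)  # [4, 7, 11, 18, 29, 47]
```

Each list obeys the recurrence a(n) = a(n-1) + a(n-2); the two sequences differ only in their starting values.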


Molecular  Connectivity  Indices 


The  most  successful  topological  index  published  to  date  in  terms  of 
its number of applications is the molecular connectivity index of Randić.
The index was put forward in 1975 and was originally intended to characterize
the branching in alkane (CnH2n+2) species [32]. More recently, it has
been  shown  to  have  numerous  applications  in  both  the  physical  and  biological 
sciences  [5,9,10].  The  index  is  based  on  the  notion  of  edge  types  in 
chemical  graphs  and  is  defined  by  the  general  relation: 


$$\chi(G) = \sum_{\text{edges}} (v_i v_j)^{-1/2} , \qquad (5)$$


where the summation extends over all the edges of G, and $v_i$ and $v_j$ are
the degrees of a pair of neighboring vertices connected by the edge {i,j}.
This index is now more correctly referred to as $^{1}\chi(G)$, since its early
successes led to the introduction of a whole range of other $\chi(G)$ indices,
now designated as $^{0}\chi(G)$, $^{1}\chi(G)$, $^{2}\chi(G)$, etc. for paths of length zero,
one, two, etc. The generalized index, $^{h}\chi(G)$, may thus be defined for
trees by the equation:

$$^{h}\chi(G) = \sum_{\pi} \left( v_1(\pi)\, v_2(\pi) \cdots v_{h+1}(\pi) \right)^{-1/2} , \qquad (6)$$

where $\pi$ extends over all paths of length h, and $v_\ell(\pi)$ denotes the valence
of the $\ell$th vertex on path $\pi$, with $1 \le \ell \le h+1$. The $^{h}\chi(G)$ index is, of
course,  also  derivable  by  exponentiation  to  the  power  h  of  the  adjacency 
matrix  of  G.  Indices  with  differing  h  values  will  assign  different  weightings 
to  the  contributions  made  by  primary  (CH3),  secondary  (CH2),  tertiary  (CH), 
or  quaternary  (C)  carbon  atoms  in  hydrocarbon  and  other  species. 
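
A small sketch of the path-based connectivity indices of equations (5) and (6): every path with h edges is enumerated and the inverse square root of the product of the vertex degrees along it is accumulated. The adjacency-list representation, the integer vertex labels, and the function name are assumptions for illustration; heteroatom weightings are not considered.

```python
from math import prod


def path_connectivity(adj, h):
    """Sum over all paths with h edges of (product of vertex degrees) ** -0.5."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    if h == 0:
        return sum(d ** -0.5 for d in degree.values())

    total = 0.0

    def walk(path):
        nonlocal total
        if len(path) == h + 1:
            # Each undirected path is reached from both ends; keep one orientation.
            if path[0] < path[-1]:
                total += prod(degree[v] for v in path) ** -0.5
            return
        for v in adj[path[-1]]:
            if v not in path:
                walk(path + [v])

    for start in adj:
        walk([start])
    return total


# Hydrogen-suppressed graphs with hypothetical vertex labels.
n_butane = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
isobutane = {1: [2], 2: [1, 3, 4], 3: [2], 4: [2]}

print(round(path_connectivity(n_butane, 1), 4))   # 1.9142
print(round(path_connectivity(isobutane, 1), 4))  # 1.7321
print(round(path_connectivity(n_butane, 2), 4))   # 1.0
```

With h = 1 the function reduces to the edge sum of equation (5); the lower value for the branched isomer reflects the original purpose of the index as a measure of branching.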


Figure 3. Illustration of the various types of subgraph used in calculating
Randić molecular connectivity indices.

As  part  of  the  generalization  process,  the  molecular  connectivity 
index  was  defined  for  a  variety  of  subgraphs  of  G  other  than  edges.  The 
subgraphs chosen for the definition of the index are now referred to as paths,
clusters,  path/clusters,  and  chains;  these  are  all  illustrated  in  Figure 
3.  By  allowing  for  such  subgraphs,  the  most  general  form  of  the  index 
may  be  written  as: 

$$^{h}\chi_t(G) = \sum_{k=1}^{n_h} \prod_{i=1}^{h+1} (v_i)_k^{-1/2} , \qquad (7)$$


where h is the number of edges in the subgraphs of G used to calculate
the index, t is the type of subgraph used (Figure 3), n_h is the number
of subgraphs of type t having h edges, and the index k extends over all
the n_h subgraphs. As it is of great importance to be able to deal with
molecular species containing so-called heteroatoms, i.e. atoms other than
carbon or hydrogen, a further generalization of the index was made by Kier
and Hall [33]. The valencies of heteroatoms were assigned on the basis
of their electronic charge; this led to an appropriate weighting of the
graph vertices. Molecules containing any kinds of atom can thus be treated
by means of molecular connectivity indices.

To specify even the principal applications of molecular connectivity
indices  would  be  a  substantial  undertaking.  The  authors  of  well  over  100 
papers  have  made  use  of  these  indices  for  the  correlation  and  prediction 
of  a  very  wide  variety  of  physical,  chemical,  and  biological  properties. 
In  the  physical  context  mention  may  be  made  of  properties  such  as  the  boiling 
point,  solubility,  density,  heat  of  vaporization,  and  partition  coefficient. 
In  the  more  chemical  realm,  the  properties  correlated  include  the  heat 
of  combustion,  heat  of  formation,  chromatographic  retention  time,  taste, 
and  soil  sorption.  Examples  from  the  biological  sphere  include  narcotic 
activity,  mutagenicity,  toxicity,  biodegradability,  and  bioconcentration 
factors.  Reviews  covering  the  principal  applications  of  these  indices 
have  been  published  by  Kier  and  Hall  [9],  Sabljic  and  Trinajstic  [5],  and 
Rouvray  [10].  The  surprisingly  good  correlations  obtained  with  numerous 
biological  properties  open  up  the  possibility  of  employing  the  indices 
in  several  novel  roles  embracing  areas  such  as  the  design  of  new  drugs, 
and the prediction of the toxicity of new chemical substances (which
are currently being generated at the rate of several hundred thousand per
annum) before they are actually synthesized.

Special  Purpose  Indices 


A  number  of  topological  indices  have  been  developed  for  the  purpose 
of  resolving  special  problems.  Such  indices  are  usually  designed  with 
a  specific  purpose  in  mind.  An  example  of  this  type  of  index  is  provided 
by  those  indices  used  to  characterize  a  given  portion  of  a  molecule  rather 
than  the  molecule  as  a  whole.  This  means  in  graph-theoretical  terms 
characterizing  certain  of  the  vertices  in  the  chemical  graph  of  the  molecule. 
These indices have proven to be very valuable in the investigation of
physical and chemical interactions in which one particular site in a molecule
is  highly  active  and  the  rest  of  the  molecule  is  relatively  inactive.  This 
is  normally  the  case  when  a  drug  molecule  interacts  with  a  biological 
receptor  in  a  living  organism  to  produce  some  biological  response.  Only 
one  part  of  the  molecule  is  capable  of  fitting  into  the  receptor  and 
triggering  the  appropriate  response.  We  now  discuss  two  examples  of  such 
special  purpose  indices.  One  of  these  was  introduced  to  describe  the  extent 
of  branching  present  in  alkane  species  (which  have  tree  graphs  of  maximal 
valence four), and the other to characterize each of the individual atoms
in  polycyclic  aromatic  hydrocarbons  (which  have  polyhex  graphs). 


The  Balaban  Centric  Index 


Figure 4. Illustration of the derivation of the Balaban centric index
for the graph of the molecule of 2,4-dimethylhexane. Successive prunings
remove 4, 2, and 2 vertices, giving C(G) = 4^2 + 2^2 + 2^2 = 24.

The  first  special  purpose  index  was  introduced  to  characterize  certain 
alkane  molecules  commonly  used  as  fuel  molecules  in  internal  combustion 
engines.  These  molecules  have  been  assigned  octane  number  ratings  which 
provide  a  measure  of  the  anti-knock  characteristics  of  the  fuel  in  question. 
Octane  numbers  are  known  to  depend  on  the  amount  of  branching  present  in 
the molecule; the more branched a given fuel molecule is, the less likely
it will be to self-ignite or knock upon sudden compression in air.
Therefore,  the  more  branching  present  in  the  molecule  the  better  it  will 
function  as  a  fuel.  To  quantify  the  extent  of  branching  present,  Balaban 
and Motoc [33,34] made use of the concept of normalized centric indices.
The index used in their study can be defined as:

$$B(G) = \frac{1}{2} \left[ \sum_{i} \delta_i^{2} - 2n + \frac{1}{2} \left( 1 - (-1)^{n} \right) \right] , \qquad (8)$$

where n is the number of carbon atoms in the molecule. The summation of
the $\delta_i^{2}$ terms represents the centric index for the molecule and the
remaining terms in equation (8) represent the value of the index for a
path graph on n vertices. By subtracting the two latter terms from the
summation, the index is said to be normalized. The $\delta_i$ terms are obtained
by successively pruning from the chemical graph all vertices of degree
one, $\delta_i$ being the number of vertices removed at the ith pruning step.
The $\delta_i$ terms are then squared and added to yield the index, as
illustrated in Figure 4. In linear regression analyses with their octane
numbers, the centric indices for various isomeric heptane (C7H16) and octane
(C8H18) species yielded high correlation coefficients (on the order of
0.98).
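
The pruning procedure can be sketched as follows for tree graphs: at each step all vertices of degree one (or a lone remaining vertex) are deleted and their number is recorded; the squares of these numbers are then summed, and the path-graph value is subtracted as in equation (8). The function names, the adjacency-list representation, and the vertex numbering of 2,4-dimethylhexane are illustrative assumptions.

```python
def pruning_sequence(adj):
    """Repeatedly delete all end vertices of a tree; return the number removed at each step."""
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    sequence = []
    while remaining:
        leaves = [v for v, nbrs in remaining.items() if len(nbrs) <= 1]
        if not leaves:
            raise ValueError("graph contains a cycle; this sketch handles trees only")
        for v in leaves:
            for u in remaining[v]:
                remaining[u].discard(v)
            del remaining[v]
        sequence.append(len(leaves))
    return sequence


def centric_index(adj):
    """Sum of the squared pruning counts."""
    return sum(d * d for d in pruning_sequence(adj))


def normalized_centric_index(adj):
    """Equation (8): half the excess of the centric index over that of the n-vertex path."""
    n = len(adj)
    path_value = 2 * n - (1 - (-1) ** n) / 2
    return (centric_index(adj) - path_value) / 2


# 2,4-dimethylhexane: main chain 1..6, methyl carbons 7 (on C2) and 8 (on C4).
dimethylhexane = {
    1: [2], 2: [1, 3, 7], 3: [2, 4], 4: [3, 5, 8],
    5: [4, 6], 6: [5], 7: [2], 8: [4],
}

print(pruning_sequence(dimethylhexane))          # [4, 2, 2]
print(centric_index(dimethylhexane))             # 24
print(normalized_centric_index(dimethylhexane))  # 4.0
```

The value 24 agrees with the worked example of Figure 4, and an unbranched chain gives a normalized index of zero by construction.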


Figure  5.  Illustration  of  the  graph  of  the  molecule  of  benz [a] anthracene 
showing  active  and  inactive  regions. 


The second special purpose index we discuss was originally introduced
by Randić [35] and more recently adapted by Seybold [36] to the study of
molecules having polyhex graphs. Such molecules are referred to by chemists
as  arenes  or  polycyclic  aromatic  hydrocarbons.  It  has  been  known  for  over 
50  years  that  a  number  of  these  molecules  are  carcinogenic,  i.e.  they  cause 
cancerous  lesions  in  experimental  animals  and  humans.  The  question  was 
posed  whether  the  extent  of  the  carcinogenicity  of  these  molecules  could 
be  correlated  with  a  specialized  index.  Early  on  it  was  realized  that 
certain  parts  of  these  molecules  were  active  and  others  relatively  inactive. 
The  molecules  were  accordingly  divided  up  into  various  regions,  as  shown 
in  Figure  5.  For  such  a  molecule  to  be  carcinogenic,  the  K  and  bay  regions 
need  to  be  active  while  the  L  region  must  be  inactive.  The  index  employed 
is  now  referred  to  as  the  atomic  index  as  it  characterizes  each  of  the 
atoms  in  the  molecule.  The  index  is  defined  by  the  relationship: 


$$S(G) = \sum_{j=1}^{n} d_{ij} , \qquad (9)$$

and thus equals the sum of the lengths of the shortest paths from every vertex in the
graph to vertex i. This sum, of course, is obtained by summing the $d_{ij}$
elements of the distance matrix, D(G), for the graph in question along
either the ith row or column. The index provides a measure of the
connectedness of each atom in the molecule, a low value of S(G) indicating
a  more  connected  atom.  The  more  connected  an  atom  in  a  molecule,  the  easier 
it  will  be  for  electrons  to  flow  to  and  from  that  atom  and  hence  the  more 
likely  it  is  to  be  a  reactive  atom.  This  fact  is  confirmed  in  the  case 
of  the  molecule  of  benz [a] anthracene  shown  in  Figure  5.  The  S(G)  indices 
have  been  calculated  for  atoms  of  interest,  and  it  is  evident  that  those 
with  the  lowest  values  of  the  index  are  precisely  those  that  fall  within 
the active K and bay regions. Thus, even very simple indices, such as
S(G), can often provide remarkably penetrating insights into the inner
workings of complex phenomena, such as carcinogenesis. By means of this
index, Seybold [36] was able to predict which arene molecules would be
carcinogenic in experimental animals, and also obtain reliable estimates
of the degree of carcinogenicity of each of them.
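
As a sketch of equation (9), the atomic index of every vertex can be computed as a row sum of the distance matrix, obtained here by breadth-first search. Naphthalene is used purely as a small polyhex example; the vertex numbering (5 and 10 denote the two ring-fusion carbons) and the function names are assumptions, and the values are not those of Figure 5.

```python
from collections import deque


def adjacency(edges):
    """Build an adjacency list from a list of undirected edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def atomic_indices(adj):
    """For each vertex i, the sum of its distances to all other vertices (equation 9)."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        result[source] = sum(dist.values())
    return result


# Naphthalene carbon skeleton with hypothetical labels; 5 and 10 are the fusion atoms.
naphthalene = adjacency([
    (1, 2), (2, 3), (3, 4), (4, 5), (5, 10), (10, 1),
    (5, 6), (6, 7), (7, 8), (8, 9), (9, 10),
])

S = atomic_indices(naphthalene)
print(S[5], S[1], S[2])  # 17 21 25
```

Halving the sum of all the atomic indices recovers the Wiener index of equation (1), so the two quantities are computed from the same distance matrix.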